Application of Local Bidirectional Language Model to Error Correction in Polish Medical Speech Recognition

Author

  • Jerzy SAS
Abstract

This paper describes a method for correcting short-word deletion errors in automatic speech recognition. Deletions of short words appear to be a frequent error type in Polish speech recognition. The proposed recognition process consists of two stages. In the first stage, the utterance is recognized by a typical speech recognizer based on a forward bigram language model. In the second stage, the word sequence produced by the first stage is analyzed to locate pairs of adjacent words that are likely to be separated by a short word, such as a conjunction or a preposition. The probability that a short word appears in the context of the located pair is evaluated using centered trigrams and a backward bigram language model for the short words prone to deletion. This set of probabilistic language properties is called here the Local Bidirectional Language Model, in contrast to the purely forward or purely backward models typically used in speech recognition. The decision to insert a short word is based on a comparison of two quantities: the deletion error probability of the first-stage recognizer, and the error probability of a decision based only on centered trigrams and the backward model. Despite its simplicity, the method proved effective in correcting deletions of the most frequently occurring Polish prepositions. It was tested on the recognition of spoken medical reports, where the overall short-word deletion error rate was reduced by almost 45%.
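The second stage can be illustrated with a minimal Python sketch. It is not the author's implementation. The toy corpus, the short-word set `SHORT_WORDS`, the function names, the simple backoff from centered trigrams to backward bigrams, and the fixed `threshold` are all assumptions made for illustration. In particular, the abstract describes a decision that compares the first-stage deletion error probability with the error probability of the context-based decision; the sketch replaces that comparison with a plain probability threshold.

```python
from collections import Counter

SHORT_WORDS = {"w", "z", "i"}  # illustrative Polish short words (assumption)
NONE = "<none>"                # marker: no short word between the pair


def train(sentences):
    """Count centered contexts (left, middle, right) and backward pairs (middle, right)."""
    centered = Counter()
    backward = Counter()
    for words in sentences:
        for i in range(len(words) - 1):
            left, nxt = words[i], words[i + 1]
            if left in SHORT_WORDS:
                continue
            if nxt in SHORT_WORDS:
                # A short word sits between `left` and the word after it.
                if i + 2 < len(words):
                    right = words[i + 2]
                    centered[(left, nxt, right)] += 1
                    backward[(nxt, right)] += 1
            else:
                # The two words are directly adjacent.
                centered[(left, NONE, nxt)] += 1
                backward[(NONE, nxt)] += 1
    return centered, backward


def middle_distribution(left, right, centered, backward):
    """Estimate P(middle | left, right); back off to P(middle | right) if unseen."""
    candidates = sorted(SHORT_WORDS) + [NONE]
    total = sum(centered[(left, m, right)] for m in candidates)
    if total:
        return {m: centered[(left, m, right)] / total for m in candidates}
    total = sum(backward[(m, right)] for m in candidates)
    if total:
        return {m: backward[(m, right)] / total for m in candidates}
    # Nothing known about this context: assume no short word is missing.
    return {m: (1.0 if m == NONE else 0.0) for m in candidates}


def correct_deletions(words, centered, backward, threshold=0.6):
    """Insert a short word between adjacent words when its estimate exceeds threshold."""
    out = []
    for i, word in enumerate(words):
        out.append(word)
        if i + 1 == len(words):
            break
        nxt = words[i + 1]
        if word in SHORT_WORDS or nxt in SHORT_WORDS:
            continue
        dist = middle_distribution(word, nxt, centered, backward)
        best = max(sorted(SHORT_WORDS), key=lambda m: dist[m])
        if dist[best] > threshold:
            out.append(best)
    return out


# Toy training data (hypothetical, not from the paper's medical corpus).
corpus = [
    ["ból", "w", "klatce", "piersiowej"],
    ["ból", "w", "klatce", "piersiowej"],
    ["zmiany", "w", "płucach"],
]
centered, backward = train(corpus)

# First-stage output with the preposition "w" deleted.
recognized = ["ból", "klatce", "piersiowej"]
corrected = correct_deletions(recognized, centered, backward)
```

In this toy setting, the pair ("ból", "klatce") was only ever seen with "w" between its two words, so the preposition is restored. The pair ("klatce", "piersiowej") was only seen adjacent, so nothing is inserted there. A realistic system would need smoothed estimates and a decision rule calibrated against the first-stage recognizer's measured deletion rates.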



Publication date: 2010